Search for: All records

Creators/Authors contains: "Wiens, Jenna"


  1. Free, publicly-accessible full text available May 5, 2026
  2. Free, publicly-accessible full text available May 5, 2026
  3. Free, publicly-accessible full text available December 9, 2025
  4. Objectives: To quantify differences between (1) stratifying patients by predicted disease onset risk alone and (2) stratifying by predicted disease onset risk and severity of downstream outcomes. We perform a case study of predicting sepsis. Materials and Methods: We performed a retrospective analysis using observational data from Michigan Medicine at the University of Michigan (U-M) between 2016 and 2020 and the Beth Israel Deaconess Medical Center (BIDMC) between 2008 and 2012. We measured the correlation between the estimated sepsis risk and the estimated effect of sepsis on mortality using Spearman’s correlation. We compared patients stratified by sepsis risk with patients stratified by sepsis risk and effect of sepsis on mortality. Results: The U-M and BIDMC cohorts included 7282 and 5942 ICU visits; 7.9% and 8.1% developed sepsis, respectively. Among visits with sepsis, 21.9% and 26.3% experienced mortality at U-M and BIDMC. The effect of sepsis on mortality was weakly correlated with sepsis risk (U-M: 0.35 [95% CI: 0.33-0.37], BIDMC: 0.31 [95% CI: 0.28-0.34]). High-risk patients identified by both stratification approaches overlapped by 66.8% and 52.8% at U-M and BIDMC, respectively. Accounting for risk of mortality identified an older population (U-M: age = 66.0 [interquartile range (IQR): 55.0-74.0] vs age = 63.0 [IQR: 51.0-72.0], BIDMC: age = 74.0 [IQR: 61.0-83.0] vs age = 68.0 [IQR: 59.0-78.0]). Discussion: Predictive models that guide selective interventions ignore the effect of disease on downstream outcomes. Reformulating patient stratification to account for the estimated effect of disease on downstream outcomes identifies a different population compared to stratification on disease risk alone. Conclusion: Models that predict the risk of disease and ignore the effects of disease on downstream outcomes could be suboptimal for stratification.
  5. Introduction: Estimating the effects of comorbidities on risk of all-cause dementia (ACD) could potentially better inform prevention strategies and identify novel risk factors compared to more common post-hoc analyses from predictive modeling. Methods: In a retrospective cohort study of patients with mild cognitive impairment (MCI) from US Veterans Affairs Medical Centers between 2009 and 2021, we used machine learning techniques from the treatment effect estimation literature to estimate individualized effects of 25 comorbidities (e.g., hypertension) on ACD risk within 10 years of MCI diagnosis. We adjusted for age and healthcare utilization using exact matching. Results: After matching, of 19,797 MCI patients, 6,767 (34.18%) experienced ACD onset. Dyslipidemia (percentage point increase of ACD risk range across different treatment effect estimation techniques = 0.009–0.044), hypertension (range = 0.007–0.043), and diabetes (range = 0.007–0.191) consistently had non-zero average effects. Discussion: Our findings support known associations between dyslipidemia, hypertension, and diabetes that increase the risk of ACD in MCI patients, demonstrating the potential for these approaches to identify novel risk factors.
  6. Current causal inference approaches for estimating conditional average treatment effects (CATEs) often prioritize accuracy. However, in resource-constrained settings, decision makers may only need a ranking of individuals based on their estimated CATE. In these scenarios, exact CATE estimation may be an unnecessarily challenging task, particularly when the underlying function is difficult to learn. In this work, we study the relationship between CATE estimation and optimizing for CATE ranking, demonstrating that optimizing for ranking may be more appropriate than optimizing for accuracy in certain settings. Guided by our analysis, we propose an approach to directly optimize for rankings of individuals to inform treatment assignment that aims to maximize benefit. Our tree-based approach maximizes the expected benefit of the treatment assignment using a novel splitting criterion. In an empirical case study across synthetic datasets, our approach leads to better treatment assignments compared to CATE estimation methods as measured by expected total benefit. By providing a practical and efficient approach to learning a CATE ranking, this work offers an important step towards bridging the gap between CATE estimation techniques and their downstream applications.
  7. Background: Timely interventions, such as antibiotics and intravenous fluids, have been associated with reduced mortality in patients with sepsis. Artificial intelligence (AI) models that accurately predict risk of sepsis onset could speed the delivery of these interventions. Although sepsis models generally aim to predict its onset, clinicians might recognize and treat sepsis before the sepsis definition is met. Predictions occurring after sepsis is clinically recognized (i.e., after treatment begins) may be of limited utility. Researchers have not previously investigated the accuracy of sepsis risk predictions that are made before treatment begins. Thus, we evaluate the discriminative performance of AI sepsis predictions made throughout a hospitalization relative to the time of treatment. Methods: We used a large retrospective inpatient cohort from the University of Michigan’s academic medical center (2018–2020) to evaluate the Epic sepsis model (ESM). The ability of the model to predict sepsis, both before sepsis criteria are met and before indications of treatment plans for sepsis, was evaluated in terms of the area under the receiver operating characteristic curve (AUROC). Indicators of a treatment plan were identified through electronic data capture and included the receipt of antibiotics, fluids, blood culture, and/or lactate measurement. The definition of sepsis was a composite of the Centers for Disease Control and Prevention’s surveillance criteria and the severe sepsis and septic shock management bundle definition. Results: The study included 77,582 hospitalizations. Sepsis occurred in 3766 hospitalizations (4.9%). ESM achieved an AUROC of 0.62 (95% confidence interval [CI], 0.61 to 0.63) when including predictions made before sepsis criteria were met and, in some cases, after clinical recognition. When excluding predictions after clinical recognition, the AUROC dropped to 0.47 (95% CI, 0.46 to 0.48). Conclusions: We evaluate a sepsis risk prediction model to measure its ability to predict sepsis before clinical recognition. Our work has important implications for future work in model development and evaluation, with the goal of maximizing the clinical utility of these models. (Funded by Cisco Research and others.)
  8. Noisy training labels can hurt model performance. Most approaches that aim to address label noise assume label noise is independent from the input features. In practice, however, label noise is often feature- or instance-dependent, and therefore biased (i.e., some instances are more likely to be mislabeled than others). In clinical care, for instance, female patients are more likely to be under-diagnosed for cardiovascular disease compared to male patients. Approaches that ignore this dependence can produce models with poor discriminative performance, and in many healthcare settings, can exacerbate issues around health disparities. In light of these limitations, we propose a two-stage approach to learn in the presence of instance-dependent label noise. Our approach utilizes anchor points, a small subset of data for which we know the observed and ground truth labels. On several tasks, our approach leads to consistent improvements over the state-of-the-art in discriminative performance (AUROC) while mitigating bias (area under the equalized odds curve, AUEOC). For example, when predicting acute respiratory failure onset on the MIMIC-III dataset, our approach achieves a harmonic mean (AUROC and AUEOC) of 0.84 (SD [standard deviation] 0.01) while that of the next best baseline is 0.81 (SD 0.01). Overall, our approach improves accuracy while mitigating potential bias compared to existing approaches in the presence of instance-dependent label noise.
  9. During training, models can exploit spurious correlations as shortcuts, resulting in poor generalization performance when shortcuts do not persist. In this work, assuming access to a representation based on domain knowledge (i.e., known concepts) that is invariant to shortcuts, we aim to learn robust and accurate models from biased training data. In contrast to previous work, we do not rely solely on known concepts, but allow the model to also learn unknown concepts. We propose two approaches for mitigating shortcuts that incorporate domain knowledge, while accounting for potentially important yet unknown concepts. The first approach is two-staged. After fitting a model using known concepts, it accounts for the residual using unknown concepts. While flexible, we show that this approach is vulnerable when shortcuts are correlated with the unknown concepts. This limitation is addressed by our second approach that extends a recently proposed regularization penalty. Applied to two real-world datasets, we demonstrate that both approaches can successfully mitigate shortcut learning. 
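
Entry 4 above reports two quantities: Spearman's correlation between estimated sepsis risk and the estimated effect of sepsis on mortality, and the overlap between the high-risk groups chosen by the two stratification approaches. The following Python sketch illustrates how such quantities could be computed. It is not the authors' code. The scores are synthetic, the function names are hypothetical, and combining risk and effect by multiplication is an assumption made here for illustration only; the abstract does not say how the two were combined.

```python
# Illustrative sketch only: synthetic scores, not data from the study.

def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


def top_k(scores, k):
    """Indices of the k highest-scoring patients."""
    ordered = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ordered[:k])


def overlap_fraction(risk, effect, k):
    """Share of the top-k by risk alone that is also top-k by risk * effect.

    ASSUMPTION: multiplying risk by effect is a stand-in for "stratifying by
    risk and severity of downstream outcomes"; the source does not specify it.
    """
    combined = [r * e for r, e in zip(risk, effect)]
    return len(top_k(risk, k) & top_k(combined, k)) / k


# Synthetic per-patient scores (hypothetical values).
risk = [0.9, 0.8, 0.7, 0.2, 0.1, 0.6]
effect = [0.1, 0.5, 0.4, 0.9, 0.2, 0.05]

print("Spearman correlation:", spearman(risk, effect))
print("Top-3 overlap:", overlap_fraction(risk, effect, k=3))
```

A weak correlation together with a partial overlap would mean that the two approaches flag different patients, which is the pattern the abstract describes.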